 side effect


New cancer tech sends chemo straight to tumors

FOX News



China Approves the First Brain Chips for Sale--and Has a Plan to Dominate the Industry

WIRED

While the United States and Europe are moving cautiously forward with clinical trials, China is racing toward the commercialization of brain implants. China has made history by becoming the first nation to approve a commercially available brain chip to treat a disability. NEO, the implant developed by Neuracle Medical Technology, translates the thoughts of a person with paralysis into movements of an assistive robotic hand. After 18 months of testing that proved its safety, China's National Medical Products Administration authorized the implant for people aged 19 to 60 with paralysis caused by neck or spinal cord injuries that prevent them from moving their limbs. According to Nature, the implant embedded in the skull is about the size of a coin.





f50a6c02a3fc5a3a5d4d9391f05f3efc-Paper.pdf

Neural Information Processing Systems

In toy environments, Attainable Utility Preservation (AUP) avoided side effects by penalizing shifts in the ability to achieve randomly generated goals [22]. We scale this approach to large, randomly generated environments based on Conway's Game of Life.
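The penalty described in this snippet can be illustrated with a minimal sketch. It assumes the agent already has value estimates for a set of auxiliary goals, both for the candidate action and for a no-op baseline. The function names, the no-op baseline, the mean-absolute-difference form, and the weight `lam` are illustrative assumptions, not details taken from the snippet.

```python
from typing import Sequence


def aup_penalty(q_action: Sequence[float], q_noop: Sequence[float]) -> float:
    """Mean absolute shift in attainable value across auxiliary goals.

    q_action[i]: estimated value for auxiliary goal i after taking the action.
    q_noop[i]:   estimated value for auxiliary goal i after doing nothing.
    Both inputs are assumed to be precomputed (hypothetical estimates).
    """
    if len(q_action) != len(q_noop) or not q_action:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - n) for a, n in zip(q_action, q_noop)) / len(q_action)


def aup_reward(task_reward: float,
               q_action: Sequence[float],
               q_noop: Sequence[float],
               lam: float = 0.1) -> float:
    """Task reward minus a scaled penalty; lam is an assumed trade-off weight."""
    return task_reward - lam * aup_penalty(q_action, q_noop)


# An action that leaves every auxiliary goal as attainable as before costs nothing.
print(aup_penalty([1.0, 2.0], [1.0, 2.0]))
# An action that shifts attainable value is penalized in the shaped reward.
print(aup_reward(1.0, [1.0, 3.0], [0.0, 1.0], lam=0.5))
```

Using the absolute difference means the sketch penalizes gains in ability as well as losses, which matches the snippet's wording of penalizing "shifts" rather than only decreases.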



FDA Approves Pill Version of Wegovy

WIRED

Novo Nordisk's semaglutide will soon be available in a daily pill Americans can take for weight loss. The US Food and Drug Administration today approved a pill version of the blockbuster anti-obesity drug Wegovy. Made by Novo Nordisk, the pill is taken once a day. The company's original version of Wegovy is a weekly injection. Both drugs contain the same active ingredient, semaglutide.


Analysing Moral Bias in Finetuned LLMs through Mechanistic Interpretability

arXiv.org Artificial Intelligence

Large language models (LLMs) have been shown to internalize human-like biases during finetuning, yet the mechanisms by which these biases manifest remain unclear. In this work, we investigated whether the well-known Knobe effect, a moral bias in intentionality judgements, emerges in finetuned LLMs and whether it can be traced back to specific components of the model. We conducted a Layer-Patching analysis across 3 open-weights LLMs and demonstrated that the bias is not only learned during finetuning but also localized in a specific set of layers. Surprisingly, we found that patching activations from the corresponding pretrained model into just a few critical layers is sufficient to eliminate the effect. Our findings offer new evidence that social biases in LLMs can be interpreted, localized, and mitigated through targeted interventions, without the need for model retraining.
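The layer-patching idea in this abstract can be shown on a toy model. The sketch below treats each model as a plain list of scalar functions, which is an assumption made for illustration; real LLM layers, the paper's choice of layers, and its evaluation metric are not represented, and the helper names are hypothetical.

```python
from typing import Callable, List, Set

Layer = Callable[[float], float]  # toy stand-in for a transformer layer


def run_with_cache(layers: List[Layer], x: float) -> List[float]:
    """Run a model and record the activation after every layer."""
    activations = []
    for layer in layers:
        x = layer(x)
        activations.append(x)
    return activations


def run_patched(finetuned: List[Layer], pretrained: List[Layer],
                x: float, patch_layers: Set[int]) -> float:
    """Run the finetuned model, overwriting selected layer outputs
    with the pretrained model's activations on the same input."""
    donor = run_with_cache(pretrained, x)
    for i, layer in enumerate(finetuned):
        x = layer(x)
        if i in patch_layers:
            x = donor[i]  # patch in the pretrained activation
    return x


# Toy models that differ only at layer 1 (the assumed "critical" layer).
pretrained = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
finetuned = [lambda v: v + 1, lambda v: v * 2 + 10, lambda v: v - 3]

print(run_patched(finetuned, pretrained, 1.0, set()))  # unpatched finetuned output
print(run_patched(finetuned, pretrained, 1.0, {1}))    # output with layer 1 patched
```

In this toy setup, patching the single layer where the two models differ restores the pretrained output, which mirrors the abstract's claim that patching a few critical layers is enough to remove the effect.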